Abstract

Maximum likelihood estimation in statistics leads to the problem of maximizing a product of powers of polynomials. We study the algebraic degree of the critical equations of this optimization problem. This degree is related to the number of bounded regions in the corresponding arrangement of hypersurfaces, and to the Euler characteristic of the complexified complement. Under suitable hypotheses, the maximum likelihood degree equals the top Chern class of a sheaf of logarithmic differential forms. Exact formulae in terms of degrees and Newton polytopes are given for polynomials with generic coefficients.

1 Introduction

In algebraic statistics [13, 21, 22], a model for discrete data is a map $f : \mathbb{R}^d \to \mathbb{R}^n$ whose coordinates $f_1, \ldots, f_n$ are polynomial functions in the parameters $(\theta_1, \ldots, \theta_d) =: \theta$. The parameter vector $\theta$ ranges over an open subset $U$ of $\mathbb{R}^d$ such that $f(\theta)$ lies in the positive orthant $\mathbb{R}^n_{>0}$. The image $f(U)$ represents a family of probability distributions on an $n$-element state space, provided we make the extra assumption that $f_1 + \cdots + f_n - 1$ is the zero polynomial.

A given data set is a vector $u = (u_1, \ldots, u_n)$ of positive integers. The problem of maximum likelihood estimation is to find parameters $\theta$ which best explain the data $u$. This leads to the following optimization problem:

$$\text{Maximize } f_1(\theta)^{u_1} f_2(\theta)^{u_2} \cdots f_n(\theta)^{u_n} \text{ subject to } \theta \in U. \tag{1}$$

Under suitable assumptions we have an optimal solution $\hat\theta$ to the problem (1), which is an algebraic function of the data $u$. Our goal is to compute the degree of that algebraic function. We call this number the maximum likelihood degree of the model $f$. Equivalently, the ML degree is the number of complex solutions to the critical equations of (1), for a general data vector $u$. In this paper we prove results of the following form:

Theorem 1. Let $f_1, \ldots, f_n$ be polynomials of degrees $b_1, \ldots, b_n$ in $d$ unknowns. If the maximum likelihood degree of the model $f = (f_1, \ldots, f_n)$ is finite then it is less than or equal to the coefficient of $z^d$ in the generating function

$$\frac{(1-z)^d}{(1 - z b_1)(1 - z b_2) \cdots (1 - z b_n)}. \tag{2}$$

Equality holds if the coefficients of the polynomials $f_i$ are sufficiently generic.

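As an example, consider a model given by $n = 4$ quadratic polynomials in $d = 2$ parameters. The solution to (1) satisfies the two critical equations

$$\frac{u_1}{f_1}\frac{\partial f_1}{\partial \theta_1} + \frac{u_2}{f_2}\frac{\partial f_2}{\partial \theta_1} + \frac{u_3}{f_3}\frac{\partial f_3}{\partial \theta_1} + \frac{u_4}{f_4}\frac{\partial f_4}{\partial \theta_1} \;=\; \frac{u_1}{f_1}\frac{\partial f_1}{\partial \theta_2} + \frac{u_2}{f_2}\frac{\partial f_2}{\partial \theta_2} + \frac{u_3}{f_3}\frac{\partial f_3}{\partial \theta_2} + \frac{u_4}{f_4}\frac{\partial f_4}{\partial \theta_2} \;=\; 0.$$

If the $f_i$'s are general quadrics then these equations have 25 complex solutions. The formula for the maximum likelihood degree in Theorem 1 gives

$$\frac{(1-z)^2}{(1-2z)^4} \;=\; 1 + 6z + 25z^2 + 88z^3 + 280z^4 + \cdots.$$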
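The series expansion in this example can be checked mechanically. The following is a minimal sketch, not part of the original paper: the function names `series_coefficients` and `ml_degree_bound` are hypothetical, and the code only expands the generating function (2) as a power series in $z$ with integer arithmetic.

```python
from math import comb


def series_coefficients(degrees, d, terms):
    """Coefficients of (1 - z)^d / prod_i (1 - b_i z) up to z^(terms - 1).

    `degrees` is the list (b_1, ..., b_n); the names are illustrative only.
    """
    # Numerator (1 - z)^d expanded by the binomial theorem.
    coeffs = [(-1) ** k * comb(d, k) if k <= d else 0 for k in range(terms)]
    # Dividing a series A(z) by (1 - b z) gives B with B_k = A_k + b * B_(k-1).
    for b in degrees:
        for k in range(1, terms):
            coeffs[k] += b * coeffs[k - 1]
    return coeffs


def ml_degree_bound(degrees, d):
    """Coefficient of z^d in the generating function (2)."""
    return series_coefficients(degrees, d, d + 1)[d]


# Four quadrics in two parameters, as in the example above.
print(series_coefficients([2, 2, 2, 2], 2, 5))  # [1, 6, 25, 88, 280]
print(ml_degree_bound([2, 2, 2, 2], 2))         # 25
```

The in-place recurrence avoids any symbolic algebra package; by Theorem 1 the returned number is the ML degree only when the coefficients of the $f_i$ are sufficiently generic, and an upper bound otherwise.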



For special quadrics $f_i$, the ML degree can be much lower than 25. A familiar example is the independence model for two binary random variables:

$$f_1 = \theta_1\theta_2, \quad f_2 = (1-\theta_1)\theta_2, \quad f_3 = \theta_1(1-\theta_2), \quad f_4 = (1-\theta_1)(1-\theta_2). \tag{3}$$

Here the ML degree is only one because the maximum likelihood estimate $\hat\theta$ is a rational function (= algebraic function of degree one) of the data $u$:

$$\hat\theta_1 = \frac{u_1 + u_3}{u_1 + u_2 + u_3 + u_4} \quad\text{and}\quad \hat\theta_2 = \frac{u_1 + u_2}{u_1 + u_2 + u_3 + u_4}.$$
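As a sanity check of the independence model, the sketch below evaluates the two critical equations at the rational estimate using exact fractions. The function names `independence_mle` and `score` are hypothetical helpers introduced here for illustration; the partial derivatives are those of $\sum_i u_i \log f_i$ for the four polynomials in (3).

```python
from fractions import Fraction


def independence_mle(u):
    """Rational maximum likelihood estimate for the model (3)."""
    u1, u2, u3, u4 = u
    total = u1 + u2 + u3 + u4
    return Fraction(u1 + u3, total), Fraction(u1 + u2, total)


def score(u, theta):
    """Partial derivatives of sum_i u_i * log f_i(theta) for the model (3)."""
    u1, u2, u3, u4 = u
    t1, t2 = theta
    # theta_1 appears in f1, f3 as theta_1 and in f2, f4 as (1 - theta_1).
    d1 = (u1 + u3) / t1 - (u2 + u4) / (1 - t1)
    # theta_2 appears in f1, f2 as theta_2 and in f3, f4 as (1 - theta_2).
    d2 = (u1 + u2) / t2 - (u3 + u4) / (1 - t2)
    return d1, d2


u = (3, 5, 7, 11)
theta_hat = independence_mle(u)
print(theta_hat)            # (Fraction(5, 13), Fraction(4, 13))
print(score(u, theta_hat))  # (Fraction(0, 1), Fraction(0, 1))
```

Because the arithmetic is exact, a zero score confirms that the estimate is a genuine critical point rather than a numerical approximation.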

This paper is organized as follows. In Section 2 we present the algebraic geometry for studying critical points of a rational function $f = f_1^{u_1} \cdots f_n^{u_n}$ on an irreducible projective variety $X$. The critical equations $\operatorname{dlog}(f) = 0$ are interpreted as sections of the sheaf $\Omega^1_X(\log D)$ of 1-forms with logarithmic singularities along the divisor $D$ defined by $f$. In Theorem 4 we show that if $D$ is a global normal crossing divisor then the ML degree equals the degree of the top Chern class of $\Omega^1_X(\log D)$. If $X$ is projective $d$-space then this leads to Theorem 1. In Section 3 we study the case when $X$ is a smooth toric variety, and we derive a formula for the ML degree when the $f_i$'s are Laurent polynomials which are generic relative to their Newton polytopes. For instance, Example 8 shows that the ML degree is 13 if we replace (3) by

$$f_i = \alpha_i + \beta_i\theta_1 + \gamma_i\theta_2 + \delta_i\theta_1\theta_2 \qquad (i = 1,2,3,4).$$

Section 4 is concerned with the relationship of the ML degree to the bounded regions of the complement of $\{f_i = 0\}$ in $\mathbb{R}^d$. The number of these regions is a lower bound to the number of real solutions of the critical equations, and therefore a lower bound to the ML degree. We show that for plane quadrics all three numbers can be equal. However, for other combinations of plane curves the ML degree and the number of bounded regions diverge, and we prove a tight upper bound on the latter in Theorem 12. Also, following work of Terao [25] and Varchenko [?], we show in Theorem 13 that the ML degree coincides with the number of bounded regions of the arrangement of hyperplanes $\{f_i = 0\}$ when the $f_i$'s are (not necessarily generic) linear forms.

Section 5 revisits the ML degree for toric varieties, replacing the smoothness assumption by a much milder condition. Theorem 15 gives a purely combinatorial formula for the ML degree in terms of the Newton polytopes of the polynomials $f_i$. This section also discusses how resolution of singularities can be used to compute the ML degree for nongeneric polynomials.

Section 6 deals with topological methods for determining the ML degree. There we show that, under certain restrictive hypotheses, it coincides with the Euler characteristic of the complex manifold $X \setminus D$, and Theorem 22 offers a general version of the semi-continuity principle which underlies the inequality in Theorem 1. In Section 7 we relate the ML degree to the sheaf of logarithmic vector fields along $D$, which is the sheaf dual to $\Omega^1_X(\log D)$.

This paper was motivated by recent appearances of the concept of ML degree in statistics and computational biology. Chor, Khetan and Snir [7] showed that the ML degree of a phylogenetic model equals 9, and Geiger, Meek and Sturmfels [13] proved that an undirected graphical model has ML degree one if and only if it is decomposable. The notion of ML degree also makes sense for certain parametrized models for continuous data: Drton and Richardson [10] showed that the ML degree of a Gaussian graphical model equals 5, and Buot and Richards [5] studied the ML degree of certain mixture models. The ML degree always provides an upper bound on the number of local maxima of the likelihood function. Our ultimate hope is that a better understanding of the ML degree will lead to the development of custom-tailored algorithms for solving the critical equations $\operatorname{dlog}(f) = 0$. There is a need for such new algorithms, given that methods currently used in statistics (notably the EM-algorithm) often produce only local maxima in (1).

2 Critical Points of Rational Functions 

In this section we work in the following general set-up of algebraic geometry. Let $X$ be a complete factorial algebraic variety over the complex numbers $\mathbb{C}$. We also assume that $X$ is irreducible of dimension $d \geq 1$. In applications to statistics, the variety $X$ will often be a smooth projective toric variety.

Suppose that $f \in \mathbb{C}(X)$ is a rational function on $X$. Since $X$ is factorial, the local rings $\mathcal{O}_{X,x}$ are unique factorization domains. This means that the function $f$ has a global factorization which is unique up to constants:

$$f = F_1^{u_1} F_2^{u_2} \cdots F_r^{u_r}. \tag{4}$$

Here $F_i$ is a prime section of an invertible sheaf $\mathcal{O}_X(D_i)$ where $D_i$ is the divisor on $X$ defined by $F_i$. In our applications we usually assume that $r \geq n$ where $n$ is the number considered in the Introduction. For instance, if $f_1, \ldots, f_n$ are polynomials and $X = \mathbb{P}^d$ then $r = n + 1$; namely, $F_1, \ldots, F_n$ are the homogenizations of $f_1, \ldots, f_n$ using $\theta_0$, and $F_{n+1} = \theta_0$ (see the proof of Theorem 1 for details).

By (4), we can write the divisor of the rational function $f$ uniquely as

$$\operatorname{div}(f) = \sum_{i=1}^r u_i D_i,$$

where the $u_i$'s are (possibly negative) integers. Let $D$ be the reduced union of the codimension one subvarieties $D_i \subset X$, or, as a divisor, $D := \sum_{i=1}^r D_i$.

We are interested in computing the critical points of the rational function $f$ on the open set $V := X \setminus D$ complementary to the divisor $D$. In particular, we wish to know the number of critical points, counted with multiplicities.

A critical point is by definition a point $x \in X$ where the differential 1-form $df$ vanishes. If $x$ is a smooth point on $X$, and $x_1, \ldots, x_d$ are local coordinates, then $df = \sum_{j=1}^d (\partial f/\partial x_j)\, dx_j$. Hence $x$ is a critical point of $f$ if and only if

$$\frac{\partial f}{\partial x_1} = \frac{\partial f}{\partial x_2} = \cdots = \frac{\partial f}{\partial x_d} = 0. \tag{5}$$

We next rewrite the critical equations (5) using the factorization (4). Around each point $x \in X$, we may choose a local trivialization for the sheaf $\mathcal{O}_X(D_i)$ and express $F_i$ locally by a regular function. By slight abuse of notation, we denote that regular function also by $F_i$. For instance, if $X = \mathbb{P}^d$ then this means replacing the homogeneous polynomial $F_i$ by a dehomogenization.

Since $f$ has neither zeros nor poles on the open set $V$, the vanishing of $df$ is equivalent to the vanishing of the logarithmic derivative

$$\operatorname{dlog}(f) = \frac{df}{f} = \sum_{i=1}^r u_i \operatorname{dlog}(F_i) = \sum_{i=1}^r u_i \frac{dF_i}{F_i}. \tag{6}$$

We now recall some classical definitions and results concerning the sheaf of differential 1-forms with logarithmic singularities along $D$. The standard references on this subject are Deligne's book [9] and Saito's paper [23]. We define $\Omega^1_X(\log D)$ as a subsheaf of the sheaf $\Omega^1_X(D)$ of 1-forms with poles at most on $D$ and of order one. This sheaf is the image of the natural map

$$\Omega^1_X \oplus \mathcal{O}_X^{\oplus r} \longrightarrow \Omega^1_X(D)$$

which is given by the inclusion $\Omega^1_X \to \Omega^1_X(D)$ and the homomorphisms $\mathcal{O}_X \to \Omega^1_X(D)$ sending $1 \mapsto \operatorname{dlog}(F_i)$. For experts we note that our definition differs from the one in [23] when $D$ is not normal crossing. Saito's sheaf is the double dual of our $\Omega^1_X(\log D)$, which explains why his is always locally free when $X$ is a surface [23, Corollary 1.7]. Ours need not be locally free even for surfaces. However, our definition gives a natural exact sequence.

Lemma 2. If $X$ is factorial and complete then we have an exact sequence

$$0 \to \Omega^1_X \to \Omega^1_X(\log D) \to \bigoplus_{i=1}^r \mathcal{O}_{D_i} \to 0. \tag{7}$$

Proof. The local sections of the sheaf $\Omega^1_X(\log D)$ are rational 1-forms which can be written as $\omega = \sum_{i=1}^r \psi_i \cdot \operatorname{dlog}(F_i) + \eta$, where $\eta$ is a regular 1-form.

Since the $D_i$'s are distinct prime divisors and $X$ is factorial, the local rings $\mathcal{O}_{X,D_i}$ are discrete valuation rings with parameter $F_i$. Thus $F_j$ is invertible in this local ring for $j \neq i$, and $\omega$ is regular if and only if $F_i$ divides $\psi_i$ for every $i$. This implies that the homomorphism which sends $\omega$ to the vector $(\psi_i \bmod F_i)_{i=1,\ldots,r}$ is well defined, and it induces an isomorphism from the quotient $\Omega^1_X(\log D)/\Omega^1_X$ onto $\bigoplus_{i=1}^r \mathcal{O}_{D_i}$. □

Assume now that $X$ is smooth. Then both sheaves $\Omega^1_X(D)$ and $\Omega^1_X$ are locally free of rank $d = \dim(X)$. Hence the intermediate sheaf $\Omega^1_X(\log D)$ is torsion free of the same rank. Our next result shows that $\Omega^1_X(\log D)$ is locally free if and only if the divisors $D_i$ are smooth and intersect transversally.

Proposition 3. Let $x \in X$ be a smooth point, $x_1, \ldots, x_d$ local coordinates at $x$ and $D_1, \ldots, D_h$ the divisors which contain $x$. Then the sheaf $\Omega^1_X(\log D)$ is locally free at $x$ if and only if the $h \times d$-matrix $(\partial F_i/\partial x_j)$ has rank $h$ at $x$.

Proof. Any local section of $\Omega^1_X(\log D)$ can be written in the form

$$\omega = \sum_{i=1}^r \psi_i \cdot \operatorname{dlog}(F_i) + \eta = \sum_{i=1}^h \psi_i \cdot \operatorname{dlog}(F_i) + \sum_{j=1}^d \eta_j \cdot dx_j. \tag{8}$$

This observation gives rise to a local exact sequence

$$0 \to \mathcal{O}_{X,x}^{h} \to \mathcal{O}_{X,x}^{h} \oplus \mathcal{O}_{X,x}^{d} \to \Omega^1_{X,x}(\log D) \to 0. \tag{9}$$

The surjective map on the right takes $((\psi_i), (\eta_j))$ to the sum on the right hand side of (8). The injective map on the left takes the $h$-tuple $(A_1, \ldots, A_h)$ to

$$((\psi_i), (\eta_j)) \quad\text{with}\quad \psi_i = F_i A_i \quad\text{and}\quad \eta_j = -\sum_{i=1}^h A_i \frac{\partial F_i}{\partial x_j}.$$

The exactness of the sequence follows from the proof of Lemma 2. If the section $\omega$ in (8) is identically zero in $\Omega^1_{X,x}(\log D)$ then $\omega$ is in particular regular, and so $F_i$ divides each $\psi_i$.

Now, since $X$ is reduced, a coherent sheaf $\mathcal{F}$ is locally free of rank $d$ if and only if $\dim_{\mathbb{C}} \mathcal{F} \otimes \mathbb{C}_x = d$ for each point $x$. Since tensor product is right exact, it follows that this condition is verified for $\Omega^1_X(\log D)$ if and only if the matrix of $\mathcal{O}_X^h \to \mathcal{O}_X^h \oplus \mathcal{O}_X^d$, evaluated at $x$, has rank precisely $h$. Since the functions $F_1, \ldots, F_h$ vanish at $x$, this is exactly the asserted condition that the Jacobian matrix $(\partial F_i/\partial x_j)_{i=1,\ldots,h,\; j=1,\ldots,d}$ has rank $h$ at $x$. □



In the above situation where $X$ is smooth and $\Omega^1_X(\log D)$ is locally free we shall say that the divisor $D$ has global normal crossings (or GNC).

Theorem 4. Let $X$ be smooth and assume that $D$ is a GNC divisor. Then

1. the section $\operatorname{dlog}(f)$ of $\Omega^1_X(\log D)$ does not vanish at any point of $D$,

2. if the divisor $D$ intersects every curve in $X$ (in particular, if $D$ is ample) then $\operatorname{dlog}(f)$ vanishes only on a finite subset of $V = X \setminus D$,

3. if the above conclusions hold, then the number of critical points of $f$ on $V$, counted with multiplicities, equals the degree of the top Chern class $c_d(\Omega^1_X(\log D))$.

Proof. We abbreviate $\sigma := \operatorname{dlog}(f) = \sum_{i=1}^r u_i \operatorname{dlog}(F_i)$. By the proof of Proposition 3 it follows that if $(\partial F_i/\partial x_j)_{i=1,\ldots,h,\; j=1,\ldots,d}$ has rank $h$ at $x$, then $\Omega^1_X(\log D)$ is locally free of rank $d$ with generators $\operatorname{dlog}(F_i)$ and some choice of $d - h$ of the $dx_j$. If we write $\sigma$ in this basis, the coefficients of $\operatorname{dlog}(F_i)$ are the constants $u_i$ while the coefficients of the $dx_j$ are some regular functions. The first assertion follows immediately since the exponents $u_i$ are all nonzero.

The second assertion follows from the first: let $Z_\sigma$ be the zero set of the section $\sigma$. Since $Z_\sigma$ does not intersect $D$, and $D$ meets every curve in $X$, it follows that $\dim(Z_\sigma) = 0$.

Thirdly, if $\mathcal{F}$ is a locally free sheaf of rank $d$ on a smooth variety $X$ of dimension $d$, and $\sigma \in H^0(\mathcal{F})$ is a section with a zero scheme $Z_\sigma$ of dimension 0, then the length of $Z_\sigma$ equals the degree of the top Chern class $c_d(\mathcal{F})$. □

The total Chern class of a sheaf $\mathcal{F}$ is the sum $c_{tot}(\mathcal{F}) = \sum_{i=0}^d c_i(\mathcal{F}) z^i$. This is a polynomial in $z$ whose coefficients are elements in the Chow ring $A^*(X)$. Recall that every element in $A^*(X)$ has a well-defined degree which is the image of its degree $d$ part under the degree homomorphism $A^d(X) \to \mathbb{Z}$.

Corollary 5. Suppose that $X$ is smooth and $D$ is a GNC divisor on $X$ which intersects every curve. Then the number of critical points of $f$, counted with multiplicities, is the degree of the coefficient of $z^d$ in the following polynomial:

$$c_{tot}(\Omega^1_X) \cdot \prod_{i=1}^r (1 - zD_i)^{-1} \in A^*(X)[z]. \tag{10}$$

Proof. The total Chern class $c_{tot}$ is multiplicative with respect to exact sequences, i.e., if $0 \to A \to B \to C \to 0$ is an exact sequence of sheaves, then $c_{tot}(B) = c_{tot}(A) \cdot c_{tot}(C)$. Hence the sequence (7) implies the result. □



In the next section, we apply the formula (10) in the case when $X$ is a smooth projective toric variety. The Chow group $A^d(X)$ has rank one and is generated by the class of any point. This canonically identifies $A^d(X)$ with $\mathbb{Z}$ and so any top Chern class can be considered to be a number.

Corollary 6. Suppose $X$ is a smooth toric variety with boundary divisors $\Delta_1, \ldots, \Delta_s$ and $D$ is GNC and meets every curve. The number of critical points of $f$, counted with multiplicity, equals the coefficient of $z^d$ in

$$\frac{\prod_{j=1}^s (1 - z\Delta_j)}{\prod_{i=1}^r (1 - zD_i)} \in A^*(X)[z]. \tag{11}$$

Proof. By virtue of equation (10) we need only compute the total Chern class $c_{tot}(\Omega^1_X)$. For this we use the exact sequence in [?, page 87],

$$0 \to \Omega^1_X \to \Omega^1_X(\log \Delta) \to \bigoplus_{j=1}^s \mathcal{O}_{\Delta_j} \to 0,$$

where $\Delta = \sum_{j=1}^s \Delta_j$, and the fact that $\Omega^1_X(\log \Delta)$ is trivial. □

3 Models defined by Generic Polynomials 

We now apply the results of the previous section to models $f : \mathbb{R}^d \to \mathbb{R}^n$. To illustrate how this works, we first prove Theorem 1 for generic polynomials. The proof of the statement that the ML degree of generic polynomials is an upper bound on the ML degree of special polynomials (when this number is finite) is deferred to Theorem 7, which is a generalization of Theorem 1. See also Theorem 22 where this semi-continuity principle is stated in general.

Proof of Theorem 1 (generic case). The polynomials $f_1, \ldots, f_n$ are assumed to be generic among all (nonhomogeneous) polynomials of degrees $b_1, \ldots, b_n$ in $\theta_1, \ldots, \theta_d$, and $u_1, \ldots, u_n$ are positive integers. We take $X$ to be projective space $\mathbb{P}^d$ with coordinates $(\theta_0 : \theta_1 : \cdots : \theta_d)$. Our object of interest is the following rational function on $X = \mathbb{P}^d$:

$$F = \bigl(f_1^{u_1} f_2^{u_2} \cdots f_n^{u_n}\bigr)\Bigl(\frac{\theta_1}{\theta_0}, \ldots, \frac{\theta_d}{\theta_0}\Bigr).$$

The global factorization (4) of this $F$ has $r = n + 1$ prime factors, namely,

$$F_i = \theta_0^{b_i} f_i\Bigl(\frac{\theta_1}{\theta_0}, \ldots, \frac{\theta_d}{\theta_0}\Bigr) \quad\text{for } i = 1, \ldots, n,$$

and $F_{n+1} = \theta_0$ with $u_{n+1} = -b_1u_1 - b_2u_2 - \cdots - b_nu_n$. The Chow ring of $X = \mathbb{P}^d$ is $\mathbb{Z}[H]/\langle H^{d+1}\rangle$, where $H$ represents the hyperplane class. By our genericity hypothesis, the $r = n + 1$ prime factors of $F$ are smooth and global normal crossing. They correspond to the following divisor classes:

$$D_1 = b_1H, \quad D_2 = b_2H, \quad \ldots, \quad D_n = b_nH \quad\text{and}\quad D_{n+1} = H.$$

Projective space $\mathbb{P}^d$ is a smooth toric variety with $d+1$ torus-invariant divisors $\Delta_j$, each having the same class $H$. Hence the formula in (11) specializes to

$$\frac{(1 - zH)^{d+1}}{(1 - zb_1H) \cdots (1 - zb_nH)(1 - zH)} = \frac{(1 - zH)^{d}}{(1 - zb_1H) \cdots (1 - zb_nH)}.$$

Since we work in the Chow ring of projective space $\mathbb{P}^d$, the coefficient of $(zH)^d$ is the same as the coefficient of $z^d$ in the generating function in (2). □

We now generalize our results from polynomials of fixed degrees to Laurent polynomials with fixed Newton polytopes. Recall that the Newton polytope of a Laurent polynomial $f(\theta_1, \ldots, \theta_d)$ is the convex hull of the set of exponent vectors of the monomials appearing in $f$ with nonzero coefficient. Given a convex polytope $P \subset \mathbb{R}^d$ with vertices in $\mathbb{Z}^d$, by a generic Laurent polynomial with Newton polytope $P$ we will mean a sufficiently general $\mathbb{C}$-linear combination of monomials with exponent vectors in $P \cap \mathbb{Z}^d$.

In the next theorem we consider $n$ Laurent polynomials $f_1, f_2, \ldots, f_n$ having respective Newton polytopes $P_1, P_2, \ldots, P_n$. Because the $f_i$'s are Laurent polynomials, i.e., their monomials may have negative exponents, we only consider those critical points of $f = f_1^{u_1} f_2^{u_2} \cdots f_n^{u_n}$ which lie in the algebraic torus $(\mathbb{C}^*)^d$. The number of such critical points (counted with multiplicity) will be called the toric ML degree of the rational function $f$.

Let $P = P_1 + P_2 + \cdots + P_n$ denote the Minkowski sum of the given Newton polytopes, and let $X$ be the projective toric variety defined by $P$. Let $\eta_1, \ldots, \eta_s \in \mathbb{Z}^d$ be the primitive inner normal vectors of the facets of $P$. They span the rays of the fan of $X$. Let $\Delta_1, \ldots, \Delta_s$ denote the corresponding torus-invariant divisors on $X$. Each of the Newton polytopes $P_i$ is the solution set of a system of linear inequalities of the specific form

$$P_i = \{\, x \in \mathbb{R}^d \mid \langle x, \eta_j \rangle \geq -a_{ij} \text{ for } j = 1, \ldots, s \,\}.$$

The divisor on $X$ defined by the Laurent polynomial $f_i$ is linearly equivalent to $D_i = \sum_{j=1}^s a_{ij}\Delta_j$. The $a_{ij}$ are integers which can be positive or negative.



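The divisor on $X$ defined by $f = f_1^{u_1} f_2^{u_2} \cdots f_n^{u_n}$ is linearly equivalent to

$$\sum_{i=1}^n u_i D_i = \sum_{j=1}^s \Bigl(\sum_{i=1}^n u_i a_{ij}\Bigr) \cdot \Delta_j. \tag{12}$$

We abbreviate the support of this divisor by

$$I = \Bigl\{\, j \in \{1, \ldots, s\} \;\Big|\; \sum_{i=1}^n u_i a_{ij} \neq 0 \,\Bigr\}. \tag{13}$$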

A toric variety X is smooth if all the cones in its normal fan are unimodular. 

Theorem 7. If the toric variety $X$ is smooth and the toric ML degree of the rational function $f$ is finite then it is bounded above by the coefficient of $z^d$ in the following generating function with coefficients in the Chow ring of $X$:

$$\frac{\prod_{j \notin I} (1 - z\Delta_j)}{\prod_{i=1}^n (1 - zD_i)}. \tag{14}$$

Equality holds if each $f_i$ is generic with respect to its Newton polytope $P_i$.

Note that Theorem 1 is the special case of Theorem 7 when $P_i$ is the standard $d$-dimensional simplex $\operatorname{conv}\{0, e_1, \ldots, e_d\}$ scaled by a factor of $b_i$.

Proof. Let us first assume that $f_i$ is a generic Laurent polynomial with Newton polytope $P_i$. Let $\mathbb{C}[x_1, \ldots, x_s]$ be the homogeneous coordinate ring [8] of $X$ with one variable for each torus-invariant divisor $\Delta_j$. Given a Laurent polynomial $f_i(\theta)$ with Newton polytope $P_i$, the corresponding rational function on $X$ is $F_i(x)/x^{D_i}$ where $D_i$ is as defined above and $F_i$ is homogeneous of degree $D_i$. Therefore the rational function on $X$ we are interested in is

$$F = x^{-\sum_{i} u_i D_i} \prod_{i=1}^n F_i(x)^{u_i}.$$

We next show that the divisor of $F$ is GNC. Note that $F_i$ is a generic section of a line bundle on $X$ that is generated by its sections. This implies, by the Bertini-Sard theorem and by induction on $n$, that the divisors $\{F_i = 0\}$ meet transversally in the dense torus of $X$. For points in the boundary of $X$, we simply restrict to the torus orbit determined by the corresponding facet where the restricted $F_i$'s remain generic sections of the restricted bundles.

The reduced divisor of poles and zeros of $F$ is $D = \sum_i D_i + \sum_{j \in I} \Delta_j$ where $I$ is defined as in (13). Since $\sum_i D_i$ is the divisor corresponding to $P$ it is ample on $X$ by construction. So $\sum_i D_i$ meets every curve on $X$ and therefore so does $D$ and we can apply Corollary 6. A variable $x_j$ appears as a factor in $F$ if and only if $j \in I$, in which case $1 - z\Delta_j$ appears in both the numerator and denominator of (11), and we get the expression (14).

Consider now arbitrary Laurent polynomials $f_1, \ldots, f_n$ in $\theta_1, \ldots, \theta_d$ such that $f = \prod f_i^{u_i}$ has only finitely many critical points in $(\mathbb{C}^*)^d$. Let $\nu$ be the coefficient of $z^d$ in (14). Let $\mathbb{C}^m$ be the space of all $n$-tuples of Laurent polynomials with the given Newton polytopes. Consider the critical equations of $f = \prod f_i^{u_i}$ and clear denominators. The resulting collection of $d$ Laurent polynomials defines an algebraic subset $W$ in the product space $\mathbb{C}^m \times (\mathbb{C}^*)^d$. Saturate $W$ to remove any components along the hypersurfaces $\{f_i = 0\}$ and get a new algebraic subset $\widetilde{W}$. The map from $\widetilde{W}$ onto $\mathbb{C}^m$ is dominant and generically finite, and the generic fiber of this map consists of $\nu$ points.

Our given Laurent polynomials $f_1, \ldots, f_n$ represent a point $\phi$ in $\mathbb{C}^m$. Let $\theta^{(1)}, \ldots, \theta^{(k)}$ be the isolated critical points of $f$. For each $i$, consider any irreducible component $W^{(i)}$ of $\widetilde{W}$ containing the point $(\phi, \theta^{(i)})$ in $\widetilde{W} \subset \mathbb{C}^m \times (\mathbb{C}^*)^d$. By Krull's Principal Ideal Theorem, the component $W^{(i)}$ of $\widetilde{W}$ has codimension $\leq d$ and hence it has dimension $\geq m$. As the generic fiber is finite, the dimension of $W^{(i)}$ is exactly $m$ and the projection to $\mathbb{C}^m$ is dominant. Since $\theta^{(i)}$ is an isolated solution of the critical equations, the projection map to $\mathbb{C}^m$ is open [?, (3.10)], so the intersection of $W^{(i)}$ with an open neighborhood of $(\phi, \theta^{(i)})$ maps onto an open neighborhood of $\phi$. Hence every generic point $\phi'$ near $\phi$ has a preimage $(\phi', \theta'^{(i)})$ near $(\phi, \theta^{(i)})$, and these preimages are distinct for $i = 1, \ldots, k$. We conclude that $k \leq \nu$. This semicontinuity argument is called the "specialization principle" stated in Mumford's book [?, (3.26)] and also works when the $\theta^{(i)}$ have multiplicities, as shown in Theorem 22 below. □



We illustrate Theorem 7 with two examples which we revisit in Section 5.

Example 8. Consider $n$ generic polynomials $f_1(\theta_1, \theta_2), \ldots, f_n(\theta_1, \theta_2)$ where the support of $f_i$ consists of monomials $\theta_1^p\theta_2^q$ with $0 \leq p \leq s_i$ and $0 \leq q \leq t_i$, and suppose the $u_i$'s are generic. The Newton polytope of $f_i$ is the rectangle

$$P_i = \operatorname{conv}\{(0,0), (s_i, 0), (0, t_i), (s_i, t_i)\}.$$

The Minkowski sum of these rectangles is another rectangle, and $X = \mathbb{P}^1 \times \mathbb{P}^1$. In the numerator of (14), the contribution of the two torus-invariant divisors $D$ and $E$ corresponding to the left and the bottom edge of this rectangle survives. The denominator comes from the product of the divisors of $f_1, \ldots, f_n$:

$$\frac{(1 - zD)(1 - zE)}{(1 - (s_1D + t_1E)z)(1 - (s_2D + t_2E)z) \cdots (1 - (s_nD + t_nE)z)}.$$

Now, the coefficient of the term $z^2$ modulo the Chow ring relations

$$D^2 = 0, \quad E^2 = 0, \quad D \cdot E = 1$$

gives the toric ML degree

$$\sum_{k=1}^n \sum_{i=1}^k (s_i t_k + s_k t_i) \;-\; \sum_{i=1}^n (s_i + t_i) \;+\; 1. \tag{15}$$
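The $z^2$ coefficient in Example 8 can be computed directly in the Chow ring of $\mathbb{P}^1 \times \mathbb{P}^1$. The sketch below is illustrative only: the function names are hypothetical, divisor classes $aD + bE$ are stored as pairs $(a, b)$, and the closed form used for comparison is the one obtained here by expanding the quotient with $D^2 = E^2 = 0$ and $D \cdot E = 1$.

```python
def pair(x, y):
    """Intersection number of (a*D + b*E) and (c*D + d*E) on P^1 x P^1."""
    return x[0] * y[1] + x[1] * y[0]


def toric_ml_degree_rectangles(sizes):
    """z^2 coefficient of (1-zD)(1-zE) / prod_i (1 - (s_i D + t_i E) z).

    `sizes` is a list of pairs (s_i, t_i), the side lengths of the rectangles.
    """
    n = len(sizes)
    # Degree-2 part of 1 / prod(1 - L_i z): sum over i <= j of L_i * L_j.
    quadratic = sum(pair(sizes[i], sizes[j]) for i in range(n) for j in range(i, n))
    # Cross term between -(D + E) z and the degree-1 part sum_i L_i z.
    linear = sum(pair((1, 1), size) for size in sizes)
    # The numerator contributes D * E = 1 in degree 2.
    return quadratic - linear + 1


def closed_form(sizes):
    """Expanded double-sum expression for the same coefficient."""
    n = len(sizes)
    double = sum(
        sizes[i][0] * sizes[k][1] + sizes[k][0] * sizes[i][1]
        for k in range(n)
        for i in range(k + 1)
    )
    return double - sum(s + t for s, t in sizes) + 1


# Four bilinear polynomials (s_i = t_i = 1), the case mentioned in the Introduction.
print(toric_ml_degree_rectangles([(1, 1)] * 4))  # 13
```

Both routines agree on every input because they are two ways of writing the same expansion; the value 13 matches the ML degree quoted in the Introduction for four generic bilinear polynomials.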

Example 9. Let $f_1, f_2, f_3$ be generic polynomials in $\theta_1$ and $\theta_2$ with supports

$$A_1 = \{1, \theta_1, \theta_1\theta_2, \theta_1^2\}, \quad A_2 = \{1, \theta_1, \theta_2, \theta_1\theta_2, \theta_1^2\}, \quad A_3 = \{1, \theta_1\theta_2, \theta_1\theta_2^2\}.$$

The corresponding Newton polytopes $P_1, P_2, P_3$ are shown in Figure 1.

Figure 1: Three Newton polygons

The normal fan of the Minkowski sum has eight rays and is shown in Figure 2. Theorem 7 applies because the toric surface $X$ is smooth. We label the eight rays by $x_1, \ldots, x_8$ in counterclockwise order, starting with $(1, 0)$. The Chow ring $A^*(X)$ is the polynomial ring $\mathbb{Z}[x_1, \ldots, x_8]$ modulo the ideal

$$\langle\, x_1x_3,\; x_1x_4,\; x_1x_5,\; x_1x_6,\; x_1x_7,\; x_2x_4,\; x_2x_5,\; x_2x_6,\; x_2x_7,\; x_2x_8,$$
$$x_3x_5,\; x_3x_6,\; x_3x_7,\; x_3x_8,\; x_4x_6,\; x_4x_7,\; x_4x_8,\; x_5x_7,\; x_5x_8,\; x_6x_8,$$
$$x_1 - x_3 - x_4 - x_5 + x_7 + 2x_8,\;\; x_2 + x_3 - x_5 - x_6 - x_7 - x_8 \,\rangle.$$

Figure 2: The fan of a smooth projective toric surface

The three divisors corresponding to the polygons $P_1, P_2, P_3$ in Figure 1 are

$$D_1 = 2x_3 + 2x_4 + 2x_5 + x_6,$$
$$D_2 = 2x_3 + 2x_4 + 2x_5 + x_6 + x_7 + x_8,$$
$$D_3 = x_4 + 3x_5 + 2x_6 + x_7.$$

If all $u_i$ are positive, then the support of the divisor $u_1D_1 + u_2D_2 + u_3D_3$ is $I = \{3, \ldots, 8\}$. It follows that the toric ML degree is the coefficient of $z^2$ in

$$(1 - zx_1)(1 - zx_2)(1 - zD_1)^{-1}(1 - zD_2)^{-1}(1 - zD_3)^{-1}.$$

This coefficient is $14x_1x_2$, which means that the toric ML degree is 14. □
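The coefficient in Example 9 can be reproduced with intersection numbers on the toric surface. The sketch below rests on two assumptions that are not spelled out in the text: the eight ray generators are read off from the two linear relations in the ideal, and the intersection form uses the standard facts for a smooth complete toric surface that adjacent boundary divisors meet in one point and that $x_j^2 = -b_j$ when $\eta_{j-1} + \eta_{j+1} = b_j\eta_j$. All function names are hypothetical.

```python
# Ray generators in counterclockwise order, inferred from the linear relations
# x1 - x3 - x4 - x5 + x7 + 2*x8 and x2 + x3 - x5 - x6 - x7 - x8.
RAYS = [(1, 0), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (2, -1)]


def self_intersection(rays, j):
    """Return x_j^2 = -b, where rays[j-1] + rays[j+1] = b * rays[j]."""
    prev, cur, nxt = rays[j - 1], rays[j], rays[(j + 1) % len(rays)]
    total = (prev[0] + nxt[0], prev[1] + nxt[1])
    b = total[0] // cur[0] if cur[0] != 0 else total[1] // cur[1]
    assert (b * cur[0], b * cur[1]) == total  # holds on a smooth surface
    return -b


def intersection_matrix(rays):
    """Symmetric matrix of intersection numbers x_i . x_j."""
    s = len(rays)
    matrix = [[0] * s for _ in range(s)]
    for j in range(s):
        matrix[j][j] = self_intersection(rays, j)
        k = (j + 1) % s
        matrix[j][k] = matrix[k][j] = 1  # adjacent rays span a smooth cone
    return matrix


def pair(matrix, a, b):
    """Intersection number of two divisors given by coefficient vectors."""
    return sum(a[i] * matrix[i][j] * b[j] for i in range(len(a)) for j in range(len(b)))


def toric_ml_degree_example9():
    m = intersection_matrix(RAYS)
    x1 = [1, 0, 0, 0, 0, 0, 0, 0]
    x2 = [0, 1, 0, 0, 0, 0, 0, 0]
    divisors = [
        [0, 0, 2, 2, 2, 1, 0, 0],  # D1
        [0, 0, 2, 2, 2, 1, 1, 1],  # D2
        [0, 0, 0, 1, 3, 2, 1, 0],  # D3
    ]
    quadratic = sum(pair(m, divisors[i], divisors[j]) for i in range(3) for j in range(i, 3))
    total = [sum(column) for column in zip(*divisors)]
    linear = pair(m, x1, total) + pair(m, x2, total)
    return quadratic - linear + pair(m, x1, x2)


print(toric_ml_degree_example9())  # 14
```

Working with the intersection matrix avoids a Gröbner basis computation in the Chow ring, at the price of relying on the inferred ray generators.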

The toric ML degree of the model $f$ is the toric ML degree defined above for generic $u$. In this case, there is no cancellation among the coefficients in (12), and $I$ is the set of all indices $j$ such that for some $P_i$, the supporting hyperplane normal to $\eta_j$ does not pass through the origin. The toric ML degree of $f$ is a numerical invariant of the polytopes $P_1, \ldots, P_n$. A combinatorial formula for this invariant will be presented in Theorem 15 of Section 5.



4 Bounded Regions in Arrangements 

As in the Introduction, we consider $n$ polynomials $f_1, \ldots, f_n$ in $d$ unknowns $\theta_1, \ldots, \theta_d$. We now assume that all coefficients of the $f_i$'s are real numbers, and we also assume that $u_1, \ldots, u_n$ are positive integers. However, we do not assume that the union of the divisors of the $f_i$'s has global normal crossings. This is the case of interest in statistics. Consider the arrangement of hypersurfaces defined by the $f_i$'s and let $V_{\mathbb{R}} = \mathbb{R}^d \setminus \bigcup_{i=1}^n \{f_i = 0\}$ be the complement of this arrangement. A connected component of $V_{\mathbb{R}}$ is a bounded region if it is bounded as a subset of $\mathbb{R}^d$. Then the following observation holds.

Proposition 10. For any polynomial map $f : \mathbb{R}^d \to \mathbb{R}^n$ and any $u \in \mathbb{N}^n_{>0}$,

$$\#\{\text{bounded regions of } V_{\mathbb{R}}\} \;\leq\; \#\{\text{critical points of } f_1^{u_1} \cdots f_n^{u_n} \text{ in } V_{\mathbb{R}}\} \;\leq\; \text{ML degree of } f.$$

Proof. The function $f = f_1^{u_1} \cdots f_n^{u_n}$ is continuous, and on the boundary of the closure of each bounded region its value is zero. Since $f$ is nonzero inside the region, $|f|$ attains a maximum there, so $f$ has at least one (real) critical point in the interior of each region. The second inequality holds trivially, since the ML degree was defined as the number of critical points of $f_1^{u_1} \cdots f_n^{u_n}$ in $\mathbb{C}^d$, counted with multiplicities. □

This observation raises the question whether the inequalities above could be realized as equalities. We next show that this is the case when $f_1, \ldots, f_n$ are quadrics in the plane. Here the ML degree is $2n^2 - 2n + 1$ by Theorem 1.

Proposition 11. For each $n$, there are $n$ quadrics $f_1, \ldots, f_n$ in $\mathbb{R}^2$ such that

$$\#\{\text{bounded regions of } V_{\mathbb{R}}\} = \text{ML degree of } f = 2n^2 - 2n + 1.$$

Hence all critical points are real.

Proof. We will take $n$ quadrics that define "nested" ellipses with center at the origin, as suggested by Figure 3. The proof follows by induction: assume we have $2(n-1)^2 - 2(n-1) + 1$ bounded regions with $n - 1$ ellipses. Observe that the $(n-1)$st ellipse contains $2n - 3$ bounded regions. Then we add a new long and skinny ellipse which replaces the $2n - 3$ regions with $3(2n - 3) + 2$ regions. The total count comes out to be $2n^2 - 2n + 1$. □
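The induction step in this proof is a simple recurrence, which the following sketch (with a hypothetical function name) iterates and compares with the closed form $2n^2 - 2n + 1$. It only encodes the counting argument; it does not construct the ellipses.

```python
def bounded_regions_nested_ellipses(n):
    """Region count from the induction: one ellipse bounds a single region."""
    regions = 1
    for k in range(2, n + 1):
        inside_previous = 2 * k - 3  # regions inside the (k-1)st ellipse
        # The k-th ellipse replaces those regions by 3*(2k-3) + 2 regions.
        regions += 3 * inside_previous + 2 - inside_previous
    return regions


for n in range(1, 6):
    print(n, bounded_regions_nested_ellipses(n), 2 * n * n - 2 * n + 1)
```

For $n = 4$ both columns give 25, the ML degree of four general quadrics in two parameters.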

We will see such an equality holding for $n$ linear hyperplanes in $\mathbb{R}^d$ below. However, even in the plane $\mathbb{R}^2$, the number of critical points and the number of bounded regions of $V_{\mathbb{R}}$ diverge for curves of degree $\geq 3$. Theorem 1 implies that for $n$ generic plane curves of degrees $b_1, \ldots, b_n$ the ML degree is

$$\sum_{i=1}^n b_i(b_i - 2) + \sum_{i<j} b_ib_j + 1.$$

Figure 3: The "nested" ellipse construction

The optimal upper bound for the number of bounded regions of $V_{\mathbb{R}}$ is smaller than the ML degree, by the following unpublished result due to Oleg Viro.



Theorem 12. (Viro) Let $f_1, \ldots, f_n$ be real plane curves of degrees $b_1, \ldots, b_n$, and let $K$ be the number of odd degree curves among them. Then

$$\#\{\text{bounded regions of } V_{\mathbb{R}}\} \;\leq\; \sum_{i=1}^n \frac{(b_i - 1)(b_i - 2)}{2} + \sum_{i<j} b_ib_j - K + 1,$$

and this bound is optimal.

Proof. The proof is by induction. For $n = 1$ the bound above is Harnack's inequality [?], and it is optimal. Suppose the statement is proved for $n - 1$ curves, and $f_n$ defines a curve of degree $b_n$. We will take this new curve so that it has the maximum number of bounded regions allowed by Harnack's inequality, i.e. $B_n := (b_n - 1)(b_n - 2)/2 + 1$ if $b_n$ is even, and one less than that if $b_n$ is odd. One can achieve $B_n$ by taking a curve with $B_n - 1$ unnested ovals and one more distinguished piece (that gives an extra bounded region when $b_n$ is even) such that some line intersects this distinguished piece in exactly $b_n$ points. We can arrange this last curve in such a way that when we superimpose the distinguished piece on the arrangement given by $n - 1$ curves, the last curve will intersect the curve given by $f_i$ in $b_ib_n$ points. Now if we trace this last distinguished piece, every time we encounter an intersection point an extra bounded region is created, except the last point in case $b_n$ is odd. Together with the remaining $B_n - 1$ ovals we get the bound. □
In order to get any meaningful lower bound on the number of bounded regions of $V_{\mathbb{R}}$ one needs to make some assumptions. Without any assumptions the lower bound is zero: for $f_i$ of even degree we take an empty (real) curve, and for $f_i$ of odd degree we take the union of an empty curve with a line. If we let all the lines intersect in a single point there will not be any bounded region. If we insist on at least having a GNC configuration, then by the same construction the lower bound we get is the number of bounded regions in a generic arrangement of $K$ lines where $K$ is the number of odd degrees $b_i$. This idea leads us to studying the ML degree of a hyperplane arrangement.

Theorem 13. Let $f$ be given by $n$ linear polynomials $f_1, \ldots, f_n$ with real coefficients. Then the ML degree of $f$ is equal to the number of the bounded regions of $V_{\mathbb{R}}$, and all critical points of the optimization problem (1) are real.

This theorem does not assume any hypothesis such as global normal crossing. Under the GNC hypothesis, the hyperplanes would be in general position and the number of bounded regions equals $\binom{n-1}{d}$, as predicted by Theorem 1. Theorem 13 is essentially due to Varchenko [?]. We shall give a new proof.

Proof. In light of Proposition 10 we need to show that the number of bounded regions of $V_{\mathbb{R}}$ equals the number of complex solutions of the ML equations. Let $f_i = \sum_{j=1}^d a_{ij}\theta_j + c_i$ for $i = 1, \ldots, n$. The ML equations are

$$\sum_{i=1}^n \frac{u_i a_{i1}}{f_i} = \cdots = \sum_{i=1}^n \frac{u_i a_{id}}{f_i} = 0. \tag{16}$$

Consider the map $\psi : \mathbb{C}^{d+1} \to \mathbb{C}^n$ given by $\psi(\theta_0, \ldots, \theta_d) = (1/F_1, \ldots, 1/F_n)$. Here $F_i = c_i\theta_0 + \sum_{j=1}^d a_{ij}\theta_j$ is the homogenization of $f_i$. We let $\mathcal{H}$ be the central hyperplane arrangement in $\mathbb{R}^{d+1}$ given by the $F_i$. We assume that the intersection of all the hyperplanes in $\mathcal{H}$ is just the origin; otherwise, the linear forms $F_i$ depend on fewer than $d$ coordinates, and then we get infinitely many critical points. The Zariski closure of $\operatorname{im}(\psi)$ in $\mathbb{P}^{n-1}$ is a $d$-dimensional complex variety $V$. The solution set on $V$ of the $d$ linear equations

$$\sum_{i=1}^n (u_i a_{i1}) y_i = \cdots = \sum_{i=1}^n (u_i a_{id}) y_i = 0$$

consists of finitely many points provided $u_1, \ldots, u_n$ are generic. Obviously, the solutions to (16) lift to such complex solutions. In other words, the degree of the projective variety $V$ is an upper bound on the ML degree of $f$.
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Now we will compute the degree of V. This variety is the projective spec- 
trum of the N-graded algebra R = C[l/ Fi : i = 1, . . . ,n\ where deg(l/Fj) = 
1. Terao |j24j Theorem 1.4] showed that the Hilbert series of R is equal to 

(, \ codim(X) 

Here /i is the Mobius function of the intersection lattice C of the arrangement 
7i. From this series we shall determine the leading coefficient of its Hilbert 
polynomial. This coefficient has the form e/d\, where e is the degree of V. 
For large enough r, the coefficient of t^ in the Hilbert series ()17|) equals 

E(-inI_i E MX). (18) 

j=0 ^ ^ codim(X)=i 

This is the Hilbert polynomial of the graded algebra R. Its leading term is 

(-i)'^V(o)^. 
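The passage from the Hilbert series to the degree can be checked numerically. The sketch below is our own illustration (`hilbert_value` and `degree_from_whitney` are hypothetical helper names). It evaluates the expression (18) from the numbers $w_i = \sum_{\operatorname{codim}(X)=i}\mu(X)$ and tests it on three coordinate hyperplanes in $\mathbb{C}^3$, where $R$ is a polynomial ring in three variables and $w = (1,-3,3,-1)$.

```python
from math import comb


def hilbert_value(whitney, r):
    """Evaluate (18): sum_i (-1)^i * C(r-1, i-1) * w_i, valid for r >= 1."""
    return sum((-1) ** i * comb(r - 1, i - 1) * w
               for i, w in enumerate(whitney) if i >= 1)


def degree_from_whitney(whitney):
    """Degree e = (-1)^(d+1) * mu(0), read off from the top term i = d + 1."""
    top = len(whitney) - 1          # top = d + 1
    return (-1) ** top * whitney[-1]


# Boolean arrangement of three coordinate hyperplanes in C^3 (so d = 2).
whitney = [1, -3, 3, -1]
values = [hilbert_value(whitney, r) for r in range(1, 8)]
print(values, degree_from_whitney(whitney))
```

In this test case the values agree with the Hilbert function $\binom{r+2}{2}$ of a polynomial ring in three variables, and the degree is 1, the single bounded triangle cut out by three generic lines in the plane.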

We conclude that the degree of the projective variety $V$ is $(-1)^{d+1}\mu(0)$. By
Zaslavsky ^^, this number equals the number of bounded regions of $\mathcal{V}_{\mathbb{R}}$. □
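Zaslavsky's count is easy to evaluate for lines in the plane ($d = 2$). The following sketch is an illustration under our own assumptions: lines are given as integer triples $(a, b, c)$ meaning $ax + by = c$, they are pairwise distinct, and at least two of them intersect. In that case the characteristic polynomial is $\chi(t) = t^2 - nt + \sum_p (m_p - 1)$, where $m_p$ is the number of lines through the intersection point $p$, and the number of bounded regions is $\chi(1)$. The function name `bounded_regions` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def bounded_regions(lines):
    """Count bounded regions of a plane line arrangement via chi(1)."""
    through = {}  # intersection point -> indices of the lines passing through it
    for (i, (a1, b1, c1)), (j, (a2, b2, c2)) in combinations(enumerate(lines), 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel lines do not meet
        # Cramer's rule with exact rational arithmetic
        point = (Fraction(c1 * b2 - c2 * b1, det), Fraction(a1 * c2 - a2 * c1, det))
        through.setdefault(point, set()).update((i, j))
    if not through:
        raise ValueError("need at least two intersecting lines")
    return 1 - len(lines) + sum(len(s) - 1 for s in through.values())


generic = [(1, 0, 0), (0, 1, 0), (1, 1, 1), (1, 2, 3)]   # four lines in general position
concurrent = [(1, 0, 0), (0, 1, 0), (1, -1, 0)]          # three lines through the origin
print(bounded_regions(generic), bounded_regions(concurrent))
```

For the generic example the count agrees with $\binom{n-1}{d} = \binom{3}{2}$, and for the concurrent example there is no bounded region.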



Example 14. A family of important statistical models where Theorem 13
applies is the linear polynomial model of [22]. Such a model is given by a
polynomial in $r$ unknowns $x = (x_1,\dots,x_r)$ with indeterminate coefficients,

$$p(x) = \sum_{j=1}^{d} \theta_j\, x^{a_j} \qquad (\text{with } a_j \in \mathbb{N}^r),$$

together with $n$ data points $v_1,\dots,v_n \in \mathbb{R}^r$. The model is parametrized by

d d 

The ML degree is the number of bounded regions of this arrangement. □

5 Polytopes and Resolution of Singularities

We now return to the setting of Section 3, with the aim of relaxing the
restrictive smoothness hypothesis in Theorem 7. Our aim is to derive a
combinatorial formula for the toric ML degree of any model $f$ defined by
generic Laurent polynomials satisfying a mild hypothesis. The derivation of
Theorem 15 involves resolution of singularities in the toric category. At the
end of the section we shall comment on using resolution of singularities for
bounding the ML degree in general.

Given a polytope $P$ in $\mathbb{R}^d$ and a linear functional $v$ on $\mathbb{R}^d$, we write

$$P^v = \{\, p \in P \mid \forall\, p' \in P : \langle v, p\rangle \le \langle v, p'\rangle \,\}$$

for the face of $P$ at which $v$ attains its minimum. Two linear functionals $v$
and $v'$ are equivalent if $P^v = P^{v'}$. The equivalence classes are the relative
interiors of cones of the inner normal fan $\Sigma_P$. If $\sigma$ is a cone in $\Sigma_P$, or $\sigma$ is a
cone in any fan which refines $\Sigma_P$, then we write $P^\sigma = P^v$ for $v$ in the relative
interior of $\sigma$. If $f$ is a polynomial with Newton polytope $P$ then $f^\sigma$ denotes
the leading form consisting of all terms of $f$ which are supported on $P^\sigma$.

As in Section 3, let $f_1,\dots,f_n$ be Laurent polynomials with Newton polytopes
$P_1,\dots,P_n \subset \mathbb{R}^d$. Consider any fan $\Sigma$ which is a common refinement
of the inner normal fans $\Sigma_{P_1},\dots,\Sigma_{P_n}$. Suppose $\tau$ is a cone in $\Sigma$ and let $k$
be the dimension of $(P_1 + \cdots + P_n)^\tau$. There exists a $k$-dimensional linear
subspace $L$ of $\mathbb{R}^d$ and vectors $q_1,\dots,q_n \in \mathbb{R}^d$ such that $q_i + P_i^\tau$ lies in $L$ for
all $i = 1,\dots,n$. The subspace $L$ is unique and satisfies $L \cap \mathbb{Z}^d \cong \mathbb{Z}^k$. Let
$V(\,\cdot\,,\dots,\,\cdot\,)$ denote the normalized mixed volume on the subspace $L$. Here
"normalized" refers to the lattice $L \cap \mathbb{Z}^d$, as is customary in toric geometry
[12]. For any $k$-element subset $\{i_1,\dots,i_k\}$ of $\{1,2,\dots,n\}$ we abbreviate

$$V(P_{i_1},\dots,P_{i_k};\tau) = V(q_{i_1} + P_{i_1}^\tau,\dots,q_{i_k} + P_{i_k}^\tau) \quad \text{if } \operatorname{codim}(\tau) = k, \qquad (19)$$

and $V(P_{i_1},\dots,P_{i_k};\tau) = 0$ if $\operatorname{codim}(\tau) > k$. If $k = d$ and $\tau = \{0\}$ then we
simply write $V(P_{i_1},\dots,P_{i_d})$ for the mixed volume in (19). If $k = 0$ and $\tau$ is
full-dimensional then (19) equals 1; this happens in the last sum of (20).

We are now ready to state our more general toric ML degree formula. As
in Section 3, let $X$ be the toric variety corresponding to the Minkowski sum
$P = P_1 + \cdots + P_n$ and $\Sigma_X$ the normal fan with rays $\eta_1,\dots,\eta_s$. We consider
the function $f = f_1^{u_1} \cdots f_n^{u_n}$. Each polytope $P_i$ corresponds to a divisor $D_i$,
so the divisor of $f$ is $D = \sum u_i D_i$. Let $I$ be the support of $D$ as in (13).
Label the rays of $\Sigma_X$ so that $\{1,\dots,r\}$ are the indices not in $I$.



For each subset $J$ of $\{1,\dots,r\}$ let $\tau_J$ denote the smallest cone of $\Sigma$ which
contains the vectors $\eta_j$ for $j \in J$. If no such cone exists then $\tau_J$ is just a
formal symbol and the expression (19) is declared to be zero for $\tau = \tau_J$. The
mild smoothness hypothesis we need is that every singular cone of $\Sigma$ contains
at least one ray from $I$. Equivalently, all cones $\tau_J$ are smooth.

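Theorem 15. Suppose every singular cone of $\Sigma_X$ contains some ray in the
support of the divisor $D$. Then the toric ML degree of the rational function
$f$ is bounded above by the following alternating sum of mixed volumes:

$$\sum_{1 \le i_1 \le \cdots \le i_d \le n} V(P_{i_1},\dots,P_{i_d})
\;-\; \sum_{j \in \{1,\dots,r\}} \; \sum_{1 \le i_1 \le \cdots \le i_{d-1} \le n} V(P_{i_1},\dots,P_{i_{d-1}};\tau_{\{j\}})$$

$$\;+\; \sum_{\{j_1,j_2\} \subseteq \{1,\dots,r\}} \; \sum_{1 \le i_1 \le \cdots \le i_{d-2} \le n} V(P_{i_1},\dots,P_{i_{d-2}};\tau_{\{j_1,j_2\}})
\;+\; \cdots \;+\; (-1)^d \sum_{\{j_1,\dots,j_d\} \subseteq \{1,\dots,r\}} V(\emptyset;\tau_{\{j_1,\dots,j_d\}}). \qquad (20)$$

Equality holds if each $f_i$ is generic relative to its Newton polytope $P_i$.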

Proof. In order to apply Corollary El we must resolve the singularities of $X$.
For toric varieties this is done in two steps. First we get a simplicial toric
variety without adding any new rays to the fan. Second we resolve the
remaining singular (but simplicial) cones by adding new rays. This procedure is
described in detail in [12]. Typically the first step involves taking the pulling
subdivision at each ray in the fan. However, under the given hypothesis it is
enough to perform pulling subdivisions only at the rays in the support of $D$
to obtain a simplicial fan $\widetilde{\Sigma}_X$. This fine detail will be important below. Our
hypothesis holds for this intermediate fan as well, and subsequently we take
a smooth refinement $\Sigma_{X'}$ of $\widetilde{\Sigma}_X$ by adding new rays in the relative interiors
of each of the singular cones. Let $\pi : X' \to X$ be the induced map.

We will show that we get no new critical points under the resolution. 
Hence the number of critical points can be computed on X' . We finally 
claim that the Chern class formula expands into the given combinatorial 
formula. 

We investigate critical points of the pullback of our rational function:

$$F' = \pi^*(F) = \Bigl(x^{-\sum u_i \pi^*(D_i)}\Bigr) \prod_i \pi^*\bigl(F_i(x)\bigr)^{u_i}.$$

For generic $f_i$, the same argument as in the proof of Theorem 7 shows that
the reduced divisor of poles and zeros $D'$ of $F'$ is GNC. What we must show is
that all critical points of $F'$ on $X'$ are off the exceptional locus, hence actually
critical points of $F$ on $X$.

There are two types of new cones in $\Sigma_{X'}$. The first come from the
triangulation step. By our construction, any such cone must contain a ray $\eta_j$
in the support of $D$. This ray, corresponding to the strict transform under $\pi$,
is in the support of $\sum u_i \pi^*(D_i)$, and its variable appears as a factor in $F'$.
By part 1 of Theorem EJ, $F'$ has no critical points along the torus-invariant
divisor $\Delta'_j$, hence no critical points on any torus orbit contained in $\Delta'_j$.

The second type of new cone comes from the desingularization step. These
cones all contain at least one new ray $\eta_E$ corresponding to an exceptional
divisor $\Delta_E$ in $X'$. We will show there are no critical points on $\Delta_E$. Equivalently,
we show that there are no critical points on each torus orbit contained in $\Delta_E$.

Given a torus orbit let $\tau_E$ be the corresponding cone of $\Sigma'$ containing $\eta_E$.
There is some minimal cone $\tau$ of $\Sigma$ containing $\tau_E$. Let $\tau'$ be any cone of $\Sigma'$
containing $\tau_E$ that is maximal with respect to being contained in $\tau$. Since
$\tau$ is refined in $\Sigma'$ it must be a singular cone, and so by the hypothesis it
has some generating ray in the support of $\sum u_i D_i$, or equivalently the linear
function of this Cartier divisor is not identically zero on $\tau$. The pullback
keeps the same linear functional, which cannot be zero on the subset $\tau'$ of $\tau$.
As a consequence, $\tau'$ contains a ray $\eta_j$ in the support of $\sum u_i \pi^*(D_i)$. This
means $x_j^c$ appears as a factor in $F'$ for some nonzero integer $c$.

If $\eta_j$ is a generator of $\tau_E$, then as above there are no critical points on
$\Delta'_j$ and thus no critical points on the orbit corresponding to $\tau_E$. Suppose
on the other hand $\eta_j$ is not a generator of $\tau_E$. Let $x_{E_1},\dots,x_{E_k}$ be the
variables corresponding to the generators of $\tau_E$ in an affine chart of $X'$ that
contains $\tau'$. Note that the variable $x_j$ corresponding to $\eta_j$ is not among these
variables. Because $\tau$ is the minimal cone of $\Sigma$ containing $\tau_E$, the face $P_i^{\tau_E}$ is
contained in the face $P_i^{\tau}$, and hence it is contained in the face $P_i^{\tau'}$. So $(F'_i)^{\tau_E}$,
obtained by setting $x_{E_1},\dots,x_{E_k}$ to zero, does not contain any of the variables
corresponding to $\tau'$; in particular it does not contain $x_j$. On the other hand,
$F'_i = (F'_i)^{\tau_E} + G'_i$ where $G'_i$ is in the ideal generated by $x_{E_1},\dots,x_{E_k}$. Since
$x_j$ is not among the $x_E$, we have

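$$x_j \frac{\partial F'_i}{\partial x_j}\bigg|_{x_E = 0} \;=\; x_j \frac{\partial\bigl((F'_i)^{\tau_E}\bigr)}{\partial x_j} \;=\; 0,$$

where $x_E = 0$ means $x_{E_1} = \cdots = x_{E_k} = 0$. Hence we have

$$x_j \frac{\partial \log F'}{\partial x_j}\bigg|_{x_E = 0} \;=\; c \;\neq\; 0.$$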

We conclude that $F'$ has no critical points on the torus orbit corresponding
to $\tau_E$, as desired.

Thus the toric ML degree of $f$ on $X'$ is the same as that on $X$. Since $F'$
has no critical points on the exceptional locus and $D$ is ample on $X$, $\pi^*(D)$
meets any curve off the exceptional locus and therefore the ML degree must
be finite. It is computed in the Chow ring of $X'$ as the coefficient of $z^d$ in



$$\frac{\prod_{j} (1 - z\Delta'_j)\,\prod_{k} (1 - z\Delta_{E_k})}{\prod_{i=1}^{n} (1 - zD'_i)}.$$

Here $\Delta'_j$ are the strict transforms of the $\Delta_j$ not in the support of $\sum u_i D_i$.
The $\Delta_{E_k}$ are those exceptional divisors which are not in the support of
$\sum u_i \pi^*(D_i)$, and $D'_i$ are the proper transforms of the divisor classes of the $F_i$'s.
We can now expand our Chern class product, replacing $(1 - zD'_i)$ in the
denominator by $1 + zD'_i + z^2(D'_i)^2 + \cdots + z^d(D'_i)^d$ in the numerator. The
intersection product of any collection of prime torus-invariant divisors is the
cycle of the cone they span, or 0 if there is no such cone. Hence the coefficient
of $z^d$ is the sum of all terms of the form

Here $1 \le i_1 \le \cdots \le i_{d-c} \le n$ and $\tau'_c$ ranges over all dimension $c$ cones of $\Sigma'$
spanned by rays not in the support of $\sum u_i D_i$. This product is exactly the
mixed volume $V(P_{i_1},\dots,P_{i_{d-c}};\tau'_c)$.

To finish we note that if $\tau'_c$ contains an exceptional divisor $\Delta_{E_k}$, the
minimal cone $\tau$ of $\Sigma$ containing $\tau'_c$ must have dimension strictly larger than
$c$. This is because $\tau'_c$ does not have any rays in the support of $\sum u_i D_i$, hence is
not maximal in $\tau$. As a consequence all of the faces $P_i^{\tau'_c}$ have a translate that
lies in a subspace of dimension strictly less than $d - c$, and the corresponding
mixed volume is 0. In conclusion, the exceptional divisors do not contribute
to the top Chern class product and the formula reduces to the stated one. □

In two variables we recover a particularly simple formula:

Corollary 16. Let $f_1,\dots,f_n$ be generic Laurent polynomials in two variables
$(\theta_1,\theta_2)$ with Newton polygons $P_1,\dots,P_n$. If the origin lies on none of the
lines spanned by edges of their Minkowski sum $P = P_1 + \cdots + P_n$ then the
toric ML degree equals the area of $P$ plus the areas of each of the $P_i$.

Proof. This is a special case of Theorem 15 when no facets pass through the
origin. Therefore the only term is

$$\sum_{i=1}^{n} \sum_{j=i}^{n} V(P_i,P_j).$$

The Euclidean area of each $P_i$ is $V(P_i,P_i)/2$. The Euclidean area of the
Minkowski sum $P = \sum_i P_i$ equals $\tfrac{1}{2}\sum_i V(P_i,P_i) + \sum_{i<j} V(P_i,P_j)$. The stated
formula is the sum of these expressions. □
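The area formula of Corollary 16 can be evaluated directly from the vertices of the Newton polygons. The sketch below is our own illustration, not code from the paper: polygons are assumed to be given as lists of integer vertices, the helper names (`convex_hull`, `twice_area`, `minkowski_sum`, `origin_off_edge_lines`, `toric_ml_degree_2d`) are hypothetical, and genericity of the coefficients is not something the code can check.

```python
from itertools import product


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def twice_area(polygon):
    """Twice the Euclidean area (shoelace formula), an integer for lattice polygons."""
    shifted = polygon[1:] + polygon[:1]
    return abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(polygon, shifted)))


def minkowski_sum(polygons):
    """Minkowski sum of convex polygons as the hull of all vertex sums."""
    total = [(0, 0)]
    for poly in polygons:
        total = convex_hull([(x + a, y + b) for (x, y), (a, b) in product(total, poly)])
    return total


def origin_off_edge_lines(polygon):
    """True if no line spanned by an edge of the polygon passes through the origin."""
    shifted = polygon[1:] + polygon[:1]
    return all(x1 * y2 - x2 * y1 != 0 for (x1, y1), (x2, y2) in zip(polygon, shifted))


def toric_ml_degree_2d(polygons):
    """area(P) + sum_i area(P_i), the value stated in Corollary 16."""
    hulls = [convex_hull(p) for p in polygons]
    doubled = twice_area(minkowski_sum(hulls)) + sum(twice_area(h) for h in hulls)
    return doubled // 2


# Hypothetical example: a unit square and a right triangle, both away from the origin.
polys = [[(1, 1), (2, 1), (2, 2), (1, 2)], [(1, 1), (3, 1), (1, 3)]]
print(origin_off_edge_lines(minkowski_sum(polys)), toric_ml_degree_2d(polys))
```

In this example the Minkowski sum has area 7 and the two polygons have areas 1 and 2, so the formula gives 10.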

Now that we are equipped with the volume formulas in Theorem 15 and
Corollary 16, let us revisit the two-dimensional examples from Section 3.

Example 17. The Newton polygons $P_1,\dots,P_n$ in Example lHl are axis-parallel
parallelograms. The first term in the formula (fTH|) is the area of their
Minkowski sum $P_1 + \cdots + P_n$, and the second term is the sum of the
areas of the $P_i$, as in Corollary 16. The third and fourth terms are the two
correction terms stemming from the fact that the origin is a vertex of each
Newton polytope. These terms disappear if we replace one $f_i$ by $\theta_1\theta_2 f_i$.

The number 14 in Example El can also be derived using Theorem 15. The
three polygons in Figure 1 have areas 1, |, and | respectively. The area of
their sum is 15. The two divisors corresponding to $x_1$ and $x_2$ pass through the
origin and yield correction terms of 1 and 4 respectively. Finally we add back
1 for the vertex at the origin. Altogether 1 + | + | + 15 - 1 - 4 + 1 = 14. □

Our discussion so far indicates that we get the sharpest results when $X$ is
smooth and $D$ is GNC. In the toric case the smoothness hypothesis could be
largely removed as in Theorem 15. The GNC condition can also be relaxed
for certain other cases as we saw in the previous section. In general, if the pair
$(X, D)$ does not satisfy the smoothness and GNC hypotheses then we must
appeal to Hironaka's theorem on resolution of singularities (see e.g. [16]).
This furnishes a proper projective morphism $\pi : X' \to X$ such that $X'$ is
smooth and $\pi^{-1}(D)$ has GNC. We need to compute the divisor $\pi^*(\mathrm{div}(f))$ of
the pullback of the function $f$. If $D'_i$ is the proper transform of the divisor
$D_i$ and $E_1,\dots,E_k$ are the exceptional divisors of $\pi$ then

$$\pi^*(\mathrm{div}(f)) = \sum_{i=1}^{r} u_i D'_i + \sum_{j=1}^{k} \mu_j E_j,$$

where the $\mu_j$ are certain (possibly negative) integers. These integers are
$\mu_j = \sum_{i=1}^{r} u_i m_{ij}$, where $m_{ij}$ is the multiplicity of the full transform of $D_i$ along $E_j$.
The underlying reduced divisor is $D' := \sum_{i=1}^{r} D'_i + \sum_{j:\,\mu_j \neq 0} E_j$. The number
of critical points is now obtained by applying Theorem |^ to $(X', D')$ instead of
$(X, D)$. This procedure can be very complicated in practice. We illustrate it
with a simple example.

Example 18. Let $d = 2$, $n = 4$, $f_1 = x$, $f_2 = y$, $f_3 = (x-1)^2 + (y-1)^2 - 2$,
and $f_4 = (x+1)^2 + 2(y-2)^2 - 9$. The divisor $D$ is not a GNC divisor
since at the origin all four curves defined by $f_1,\dots,f_4$ meet. In order to
resolve this singularity we blow up $X = \mathbb{P}^2$ at $(0 : 0 : 1)$ to obtain $X'$, which
is smooth. We note that $c_{tot}(\Omega_{X'})$ is $(1 - zE)(1 - zH)(1 - zH')^2$ where $H$ is
the proper transform of the generic hyperplane section, $E$ is the exceptional
divisor, and $H'$ is the proper transform of a line through the origin.

We have four cases: we consider first the general case where $u_1 + u_2 +
u_3 + u_4 \neq 0$ and $u_1 + u_2 + 2u_3 + 2u_4 \neq 0$. In this case $D'$ consists of
the proper transforms of the four original curves, the exceptional divisor,
and the pullback of the line at infinity. After cancellations, we just need
to compute the coefficient of $z^2$ in $\frac{1}{(1 - zC_1)(1 - zC_2)}$, where $C_1$ and $C_2$ are the
irreducible divisors corresponding respectively to the circle and the ellipse.
This coefficient is $C_1^2 + C_2^2 + C_1 \cdot C_2$. In $X'$ the two curves intersect in three
points, and their self-intersection also yields three points. Hence the ML
degree is nine.

In the special case where $u_1 + u_2 + u_3 + u_4 = 0$, we need to compute the
coefficient of $z^2$ in $\frac{1 - zE}{(1 - zC_1)(1 - zC_2)}$, which is $C_1^2 + C_2^2 + C_1 \cdot C_2 - E \cdot C_1 - E \cdot C_2$.
Since $E \cdot C_i = 1$ the ML degree drops down to seven. If $u_1 + u_2 + 2u_3 + 2u_4 = 0$
then the coefficient of $z^2$ in $\frac{1 - zH}{(1 - zC_1)(1 - zC_2)}$ is $C_1^2 + C_2^2 + C_1 \cdot C_2 - H \cdot C_1 - H \cdot C_2$.
Since $H \cdot C_i = 2$, the ML degree is five. Finally, if both $u_1 + u_2 + u_3 + u_4$
and $u_1 + u_2 + 2u_3 + 2u_4$ are zero, then the coefficient of $z^2$ in $\frac{(1 - zE)(1 - zH)}{(1 - zC_1)(1 - zC_2)}$ is
$C_1^2 + C_2^2 + C_1 \cdot C_2 - H \cdot C_1 - H \cdot C_2 - E \cdot C_1 - E \cdot C_2 + H \cdot E$; since $H \cdot E = 0$,
the ML degree further drops down to three.
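For reference, the four cases can be tabulated as follows. This summary is ours and only restates the intersection numbers used above, namely $C_1^2 = C_2^2 = C_1 \cdot C_2 = 3$, $E \cdot C_i = 1$, $H \cdot C_i = 2$ and $H \cdot E = 0$.

```latex
\begin{array}{lll}
\text{condition on } u & \text{coefficient of } z^2 & \text{ML degree} \\ \hline
\text{generic} & C_1^2 + C_2^2 + C_1 C_2 & 3 + 3 + 3 = 9 \\
u_1 + u_2 + u_3 + u_4 = 0 & 9 - E\cdot C_1 - E\cdot C_2 & 9 - 1 - 1 = 7 \\
u_1 + u_2 + 2u_3 + 2u_4 = 0 & 9 - H\cdot C_1 - H\cdot C_2 & 9 - 2 - 2 = 5 \\
\text{both} & 9 - E\cdot C_1 - E\cdot C_2 - H\cdot C_1 - H\cdot C_2 + H\cdot E & 9 - 2 - 4 + 0 = 3
\end{array}
```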

The number of bounded regions of the complement of the four curves in
$\mathbb{R}^2$ is seven. By Proposition EH this is a lower bound on the ML degree when
all $u_i$ are positive. This example shows that, for specific negative values of the
$u_i$'s, the number of critical points may be smaller than this lower bound. □

6 ML degree and Euler characteristic 

A well-known result in the theory of hyperplane arrangements [20] states
that the number of bounded regions of a real arrangement equals the Euler
characteristic of the complement of its complexification. The Euler
characteristic is $(-1)^{d+1}\mu(0)$ where $\mu(0)$ is the Möbius number of the intersection
lattice, and, by Theorem 13, this is precisely the ML degree of the associated
linear model. Here we extend this relationship between topology and the
ML degree to statistical models which are given by nonlinear polynomials $f_i$.
Working in the general setting of Section 2, we shall prove the following:

Theorem 19. Let $X$ be a smooth complete algebraic variety over $\mathbb{C}$ of
dimension $d$, and let $D$ be the reduced divisor associated to $f = f_1^{u_1} \cdots f_n^{u_n}$.
Assume that the hypotheses (a), (b) and (c) below hold. Then the ML degree
equals $(-1)^d e_{top}(X \setminus D)$, where $e_{top}$ is the topological Euler characteristic.

Invoking Hironaka's theorem on resolution of singularities, we fix a blow-up
$\pi : \tilde{X} \to X$ such that $\tilde{X}$ is smooth, and the rational function $f$ pulls
back to a proper morphism $\tilde{f}$ onto $\mathbb{P}^1_{\mathbb{C}}$. The three hypotheses are as follows:

(a) The inverse image $D' := \pi^{-1}(D)$ of the divisor $D$ can be written as
$D' = \tilde{D} + D_h$ where $\tilde{D}$ is the support of the divisor $\mathrm{div}(\tilde{f})$ of the
pullback $\tilde{f}$ of the rational function $f$, while $D_h$ is the horizontal divisor
consisting of the sum of all components of $D'$ which map onto $\mathbb{P}^1_{\mathbb{C}}$.

(b) The restriction of $\tilde{f}$ to $D_h \setminus \tilde{D}$ is a topological fiber bundle over
$\mathbb{C}^* = \mathbb{P}^1_{\mathbb{C}} \setminus \{0, \infty\}$.

(c) The number of critical points of $f$ on $X \setminus D$ is finite.

Remark 20. Hypothesis (a) is crucial and depends on the exponents $u_i$. For
instance, consider the rational function $f = (y - x^2)(y + x^2)^{-1}$ on $X = \mathbb{P}^2$.
We get $\tilde{X}$ by blowing up the origin twice. The exceptional curve of the
first blow-up belongs to the fiber $\{\tilde{f} = 1\}$ and hence is not supported on
$\mathrm{div}(\tilde{f})$. Hypothesis (a) is not satisfied for this example. If we take instead
$f = (y - x^2)(y + x^2)^{-2}$ then hypothesis (a) is satisfied because the exceptional
curve belongs to the fiber $\{\tilde{f} = \infty\}$ and is hence supported on $\mathrm{div}(\tilde{f})$.

Hypothesis (b) implies that the cohomology ranks of $D_h \setminus \tilde{D}$ can be
computed from that of $\mathbb{C}^*$ and the fibers using Künneth's formula. The
alternating sum of the ranks is zero for the base $\mathbb{C}^*$, and we get $e_{top}(D_h \setminus \tilde{D}) = 0$.
In fact, the hypothesis (b) could be replaced by the more general condition
$e_{top}(D_h \setminus \tilde{D}) = 0$.

Any proper map $f : X \to \mathbb{C}^*$ for $X$ smooth is a topological fiber bundle
if it is a submersion, i.e. $df \neq 0$ for all points in $X$. Therefore to check
hypothesis (b), we need only find a controlled stratification (in the sense of
Thom-Mather theory ^7j) of $D_h$ into locally closed smooth sets such that
for all points on each stratum $S$, $d(\tilde{f}|_S) \neq 0$. In particular, this last condition
will imply that the critical points of $\tilde{f}$ on $\tilde{X} \setminus \tilde{D}$ are the same as the critical
points of $f$ on $X \setminus D$.

Proof of Theorem 19. Our method follows closely the proof of the Lefschetz
hyperplane section theorem (cf. P1I21). Moreover, since the complement
is not necessarily compact we shall use Borel-Moore homology [1] (see
also [6]). We note that for compact spaces $X$ the ordinary homology groups
coincide with the Borel-Moore homology groups. In the Borel-Moore homology
theory we have the following useful exact sequence to be used below: if
$X$ is locally compact, $F$ is closed in $X$, and $U := X \setminus F$, we have

$$\cdots \to H_i(F) \to H_i(X) \to H_i(U) \to \cdots. \qquad (21)$$

Thus in this situation the Borel-Moore Euler characteristic is additive:

$$e_{BM}(X) = e_{BM}(F) + e_{BM}(U). \qquad (22)$$

Finally, if $U$ is an even-dimensional orientable manifold then Poincaré duality
holds between Borel-Moore homology and ordinary cohomology, and $e_{BM}(U)$
coincides with the topological Euler number $e_{top}(U)$. In our situation we get

$$e_{top}(X \setminus D) = e_{top}(\tilde{X} \setminus D') = e_{BM}(\tilde{X} \setminus D') = e_{BM}(\tilde{X} \setminus \tilde{D}) - e_{BM}(D' \setminus \tilde{D}).$$

The last equation follows from (22). Hypothesis (b) implies $e_{BM}(D' \setminus \tilde{D}) = 0$
(see Remark 20), and hence it suffices to show that the ML degree equals

$$e_{top}(X \setminus D) = e_{BM}(\tilde{X} \setminus \tilde{D}) = e_{top}(\tilde{X} \setminus \tilde{D}).$$

In other words, we may now simply erase the tilde and consider the case
when $X$ is smooth and $f$ defines a proper morphism $X \setminus D \to \mathbb{C}^*$.

Let $C$ denote the set of critical points of $f$ on $X$. By hypothesis (c), this
set is finite and the ML degree equals its cardinality counting multiplicities:

$$\mu = \sum_{p \in C} \mu_p.$$

The multiplicity $\mu_p$ of a critical point $p$ of $f$ is known as the Milnor number
at $p$ of the hypersurface $F_p = \{x \in X : f(x) = f(p)\}$. Milnor jTHI showed
that this algebraic invariant of a singularity has the following topological
interpretation. Consider a coordinate chart around the point $p$ and intersect
the fiber $F_\epsilon := \{x : f(x) = f(p) + \epsilon\}$ with a ball of radius $\delta$ around $p$. For
$\epsilon \ll \delta$ this intersection is the Milnor fiber. Milnor jTHj showed that the Milnor
fiber is homotopy equivalent to a bouquet of $\mu_p$ spheres of dimension $d - 1$.

Each singular fiber is obtained (cf. 0^]) from a smooth fiber by replacing
the Milnor fiber by a contractible set. The Borel-Moore exact sequence
(21) implies that the Euler number of a singular fiber $F'$ is obtained from
the Euler number of a smooth fiber $F$ by adding $-(-1)^{d-1}\sum_{p \in F'} \mu_p$.

Then the Euler number of the union of the singular fibers equals $|C|$
times the Euler number of a smooth fiber $F$ plus the correction $-(-1)^{d-1}\mu$.
Applying Künneth's formula to the fiber bundle defined by $f$ on $X \setminus D$ minus
the union of the singular fibers, and then applying the additivity formula
(22), we conclude that $e_{BM}(X \setminus D) = e_{top}(X \setminus D) = (-1)^d \mu$ as desired. □
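As a consistency check of the formula $(-1)^d e_{top}(X \setminus D)$ against Theorem 13, consider the simplest case $d = 1$. The computation below is our own illustration; it assumes $X = \mathbb{P}^1$, linear polynomials $f_i$ with pairwise distinct roots $r_1,\dots,r_n$, and exponents with $u_1 + \cdots + u_n \neq 0$, so that the point at infinity belongs to $D$.

```latex
\begin{aligned}
X \setminus D &= \mathbb{P}^1 \setminus \{r_1,\dots,r_n,\infty\}, \\
e_{top}(X \setminus D) &= e_{top}(\mathbb{P}^1) - (n+1) = 2 - (n+1) = 1 - n, \\
(-1)^{1}\, e_{top}(X \setminus D) &= n - 1 .
\end{aligned}
```

The value $n - 1$ is the number of bounded intervals between $n$ distinct points on the real line, as Theorem 13 requires.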

Example 21. For another illustration consider Example 18 with $X = \mathbb{P}^2$.
The generic ML degree was 9 but it decreased by 4 when $u = (u_1, u_2, u_3, u_4)$
is a general solution of $u_1 + u_2 + 2u_3 + 2u_4 = 0$. This is consistent with
Theorem 19 because for such $u$ the divisor $D$ loses the component at infinity.
The difference is a projective line minus 6 points, which has Euler number
$-4$. Consider our hypotheses when $\tilde{X}$ is the blow-up of $X$ at the origin. If
$u_1 + u_2 + u_3 + u_4 \neq 0$ then the exceptional curve is part of $\tilde{D}$ and Theorem 19
is valid. On the other hand, if $u_1 + u_2 + u_3 + u_4 = 0$ then it maps to $\mathbb{P}^1$ under a
rational map of degree $\geq 2$, so hypothesis (a) does not hold. The philosophy
of this example is that, even if the divisor $D$ is locally biholomorphic to an
arrangement of hyperplanes, genericity of the exponents $u_i$ may be necessary
for the topological formula of Theorem 19 to hold. □

Theorems ^ [3 and El offer combinatorial formulae for the ML degree
and hence (using Theorem 19) for the Euler number of the complement of
an arrangement of generic hypersurfaces $\{f_i = 0\}$ in $X = \mathbb{P}^d$ or in a toric
manifold. In each of these theorems, the combinatorial number becomes
an upper bound for the ML degree when the coefficients of the $f_i$'s are
special. This semi-continuity principle will be explained by the following general
topological result, in which also the underlying manifold is allowed to vary.

Theorem 22. Assume we are given a one-parameter smooth proper family
$X_t$ of complex manifolds over the unit disk $B := \{t \in \mathbb{C} : |t| < 1\}$, and a
family of rational functions $f_t$ on $X_t$, such that

1. for $t \neq 0$ the divisor $D(t)$ defined by $f_t$ has GNC, and

2. for $t = 0$ the divisor $D(0)$ defined by $f_0$ has the same homology class
as $D(t)$ for the natural differentiable trivialization of the family $X_t$.

Then the ML degree of $f_0$ is less than or equal to the ML degree of $f_t$.

In order to understand the second hypothesis, let us recall Ehresmann's
Theorem ^T]: any proper submersion $\phi : X \to B$ of differentiable manifolds
is a differentiable fiber bundle, i.e., if $U$ is a sufficiently small open set in $B$,
there is a local diffeomorphism between $\phi^{-1}(U)$ and $U \times F$, for a fiber $F$,
and this diffeomorphism is compatible with the two projections to $U$.

Sketch of Proof. Let $\mathcal{X} \to B$ be the total space of the family $\{X_t\}_{t \in B}$. We
consider the function $\phi : \mathcal{X} \to B \times \mathbb{P}^1$, given by $\phi(x) = (t(x), f(x))$.

Consider the locus $\Sigma \subset \mathcal{X}$ given by the vanishing of the vertical differential
of $\phi$: this is the local complete intersection defined by the $d$ partials $\partial f/\partial x_i = 0$,
where $x_1,\dots,x_d$ are local coordinates on the fibers provided by the Implicit
Function Theorem. At each critical point $p$ of $f_0$, the locus $\Sigma$ has dimension 1,
and thus, in a neighborhood of $p$, the morphism $\Sigma \to B$ is finite, whence its
degree is locally constant. This establishes the desired semi-continuity. □

In general, it might be difficult to show that a given rational function $f_0$
has a perturbation as above, or we might want to calculate the ML degree
with more algebraic precision. The results in the next section may help.

7 Logarithmic vector fields 

In this section we will show that the formula for the ML degree given by the
logarithmic Chern number (Theorem HJ) holds in greater generality. Returning
to the setting of Section 2, we consider logarithmic vector fields along
$D$. Again, our definition differs slightly from the one given by Saito.

Definition 23. If $D$ is a reduced divisor on a factorial variety $X$, the sheaf
of logarithmic vector fields $\Theta_X(-\log D)$ is the dual $\mathcal{H}om_{\mathcal{O}_X}(\Omega^1_X(\log D), \mathcal{O}_X)$.

Recall that the tangent sheaf $\Theta_X$ is $\mathcal{H}om_{\mathcal{O}_X}(\Omega^1_X, \mathcal{O}_X)$, the dual of the
1-forms on $X$. The inclusion of $\Omega^1_X$ into $\Omega^1_X(\log D)$, studied in Lemma |21,
dualizes to an inclusion of the logarithmic vector fields into the tangent sheaf.

Proposition 24. We have the following exact sequence of sheaves on $X$:

$$0 \to \Theta_X(-\log D) \to \Theta_X \to \bigoplus_{i=1}^{r} \mathcal{O}_{D_i}(D_i) \to \mathcal{E}xt^1_{\mathcal{O}_X}(\Omega^1_X(\log D), \mathcal{O}_X).$$

If $X$ is smooth, then the rightmost homomorphism is onto, and the total
Chern class of the sheaf of logarithmic vector fields equals

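$$c_{tot}(\Theta_X(-\log D)) = \frac{c_{tot}(\Theta_X) \cdot c_{tot}\bigl(\mathcal{E}xt^1_{\mathcal{O}_X}(\Omega^1_X(\log D), \mathcal{O}_X)\bigr)}{\prod_{i=1}^{r} (1 + zD_i)}. \qquad (23)$$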

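Proof. Dualizing the sequence $0 \to \mathcal{O}_X(-D_i) \to \mathcal{O}_X \to \mathcal{O}_{D_i} \to 0$, we get

$$\mathcal{H}om_{\mathcal{O}_X}(\mathcal{O}_{D_i}, \mathcal{O}_X) = 0 \quad \text{and} \quad \mathcal{E}xt^1_{\mathcal{O}_X}(\mathcal{O}_{D_i}, \mathcal{O}_X) = \mathcal{O}_{D_i}(D_i).$$

Hence the $\mathcal{E}xt_{\mathcal{O}_X}$-sequence obtained by dualizing the sequence ((Tj) has the form

$$0 \to \Theta_X(-\log D) \to \Theta_X \to \bigoplus_{i=1}^{r} \mathcal{O}_{D_i}(D_i) \to \mathcal{E}xt^1_{\mathcal{O}_X}(\Omega^1_X(\log D), \mathcal{O}_X) \to \mathcal{E}xt^1_{\mathcal{O}_X}(\Omega^1_X, \mathcal{O}_X) \to \cdots$$

This is the first statement of Proposition 24. If $X$ is smooth then the cotangent
sheaf $\Omega^1_X$ is locally free, and we have $\mathcal{E}xt^1_{\mathcal{O}_X}(\Omega^1_X, \mathcal{O}_X) = 0$. The formula (23)
follows from the multiplicativity of the total Chern class. □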

Remark 25. The two leftmost maps in the exact sequence of Proposition 24
characterize the logarithmic vector fields on $X$ along $D$ as those vector fields
$\xi \in \Theta_X$ which satisfy $\xi(F_i) \equiv 0 \pmod{F_i}$ for all $i$. In other words, for each
$i = 1,\dots,h$, the vector field $\xi = \sum_{j=1}^{d} \xi_j\, \partial/\partial x_j$ has the property that there
exist functions $\psi_i$ such that $\xi(F_i) := \sum_{j=1}^{d} \xi_j\, \partial F_i/\partial x_j = \psi_i F_i$.

Remark 26. An interesting case will be the one where $X$ is smooth and
the sheaf $\mathcal{E}xt^1_{\mathcal{O}_X}(\Omega^1_X(\log D), \mathcal{O}_X)$ has zero-dimensional support and length $p$.
Here the top Chern class of $\Theta_X(-\log D)$ and the top Chern class of $\Omega^1_X(\log D)$
differ by $(-1)^{d-1}p$, and the count in part 3 of Theorem 0] changes accordingly.

As in the proof of Theorem HI set $\sigma := \mathrm{dlog}(f)$ and let $Z_\sigma$ be the
subscheme of $X$ defined by the vanishing of $\sigma$. The restriction of $Z_\sigma$ to the open
set $V = X \setminus D$ is the critical locus of $f$. The reason why the sheaf $\Theta_X(-\log D)$
is important is that it enters directly into the algebraic description of $Z_\sigma$.

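Indeed, the section $\sigma$ of $\Omega^1_X(\log D)$ corresponds to an exact sequence

$$0 \to \mathcal{O}_X \to \Omega^1_X(\log D) \to \mathcal{E} \to 0,$$

where $\mathcal{O}_{Z_\sigma}$ is the kernel of $\mathcal{E}xt^1_{\mathcal{O}_X}(\mathcal{E}, \mathcal{O}_X) \to \mathcal{E}xt^1_{\mathcal{O}_X}(\Omega^1_X(\log D), \mathcal{O}_X)$. Hence
the $\mathcal{E}xt_{\mathcal{O}_X}$-sequence obtained by dualizing the previous sequence has the form

$$0 \to \mathcal{H}om_{\mathcal{O}_X}(\mathcal{E}, \mathcal{O}_X) \to \Theta_X(-\log D) \to \mathcal{O}_X \to \mathcal{O}_{Z_\sigma} \to 0. \qquad (24)$$

This leads to the following more refined formula for the ML degree.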

Theorem 27. Assume that $X$ is smooth, $D$ meets every curve, the sheaf
$\Theta_X(-\log D)$ is locally free, and $Z_\sigma$ does not intersect the divisor $D$. Then
the number of critical points of $f$ on $V = X \setminus D$ equals $(-1)^d c_d(\Theta_X(-\log D))$.



Proof. Using (24), we can view $Z_\sigma$ as the locus of zeros of the locally free sheaf
dual to $\Theta_X(-\log D)$, whose top Chern class is $(-1)^d c_d(\Theta_X(-\log D))$. □



Remark 28. The sequence (24) gives the following description of the
equations defining the critical points. The ideal sheaf of the subscheme $Z_\sigma$ is
generated by the functions $\sum_{i=1}^{r} u_i \psi_i$, where $\xi$ varies among the logarithmic
vector fields and $(\psi_1,\dots,\psi_r)$ is derived from $\xi$ as in Remark 25. This
holds because the section $\sigma = \mathrm{dlog}(f)$ factors through the homomorphism
$\mathcal{O}_X^r \to \Omega^1_X(\log D)$, thus the homomorphism dual to the section $\sigma$ also factors
through the dual homomorphism $\Theta_X(-\log D) \to \mathcal{O}_X^r$, $\xi \mapsto (\psi_1,\dots,\psi_r)$.

We next present two examples which illustrate the hypotheses of Theorem
27. Here $X$ is a smooth surface (i.e. $d = 2$) with local coordinates $x$ and $y$.

Example 29. Let $h = 2$ with $F_1 = x$ and $F_2 = x^v - y^u$. A vector field
$\xi = a(x,y) \cdot \partial/\partial x + b(x,y) \cdot \partial/\partial y$ is logarithmic if and only if $a = x\psi_1$
and $b = (v/u)\, y\, \psi_1 + \lambda\,(x^v - y^u)$, for some function $\lambda$. Observe that $\psi_2 =
v\psi_1 - u\,y^{u-1}\lambda$. We conclude that the sheaf $\Theta_X(-\log D)$ is locally free of rank 2.
The origin does not belong to the subscheme $Z_\sigma$ if there is a function
$u_1\psi_1 + u_2\psi_2 = (u_1 + v \cdot u_2)\psi_1 - u\,u_2\,y^{u-1}\lambda$, for some $\psi_1$ and $\lambda$, which does not
vanish at the origin. This holds if and only if $u_1 + v \cdot u_2 \neq 0$. □

Example 30. Let $h = 3$ with $F_1 = x$, $F_2 = y$ and $F_3 = x - y$. Logarithmic
vector fields $\xi$ have the form $x\psi_1 \cdot \partial/\partial x + y\psi_2 \cdot \partial/\partial y$ with $x\psi_1 - y\psi_2$ divisible
by $x - y$. This implies $\psi_1 = \lambda + y \cdot \mu$, $\psi_2 = \lambda + x \cdot \mu$, and $\psi_3 = \lambda$, for any
functions $\lambda, \mu$. Thus $\Theta_X(-\log D)$ is locally free of rank two. The origin does
not belong to the subscheme $Z_\sigma$ if and only if $u_1 + u_2 + u_3 \neq 0$.
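The claims of Example 30 can be verified by a direct computation, sketched here as our own check in the notation of Remark 25 and Remark 28.

```latex
\begin{aligned}
\xi(F_3) &= x\psi_1 - y\psi_2
          = x(\lambda + y\mu) - y(\lambda + x\mu)
          = \lambda\,(x - y) = \psi_3 F_3, \\
\bigl(u_1\psi_1 + u_2\psi_2 + u_3\psi_3\bigr)(0,0)
         &= (u_1 + u_2 + u_3)\,\lambda(0,0).
\end{aligned}
```

Since $\lambda$ is arbitrary, some generator of the ideal of $Z_\sigma$ is nonzero at the origin exactly when $u_1 + u_2 + u_3 \neq 0$.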